We study the instrumental features of photons from the peak observed at $E_\gamma = 130$ GeV in the 
spectrum of Fermi-LAT data. We use the sPlots algorithm to reconstruct, separately for the 
photons in the peak and for background photons, the distributions of incident angles, the recorded 
time, features of the spacecraft position, the zenith angles, the conversion type and details of the 
energy and direction reconstruction. The presence of a striking feature or cluster in such a variable 
would suggest an instrumental cause for the peak. In the publicly available data, we find several 
suggestive features which may inform further studies by instrumental experts, though the size of 
the signal sample is too small to draw statistically significant conclusions. 

INTRODUCTION 

While the existence of dark matter is widely accepted, 
its particle nature remains undiscovered. Potential av- 
enues for discovery include observation of production at 
high energy accelerators, scattering with heavy nuclei in 
large low-noise underground volumes, or annihilation. 

A clear signal of dark matter annihilation may be car- 
ried by gamma rays traveling to Earth from regions in 
the galaxy of high dark-matter density. As they do not 
typically scatter after their production, the photon energy
and direction are powerful handles for understanding
the mechanism of dark matter annihilation into standard
model particles. 

One mechanism is annihilation resulting in quarks, 
which would hadronize and yield $\pi^0$ particles which in 
turn produce photons. The spectrum of such a process 
would give fairly low energy photons ($E_\gamma \lesssim 50$ GeV) 
which may be difficult to distinguish from other sources. 

A more striking feature may appear from annihilation 
directly into two-body final states including a photon. 
Rather than yielding a broad energy spectrum, this process
would produce a photon with a well-defined energy 
given (for the process $\chi\chi \to \gamma Y$) by 

$$E_\gamma = m_\chi \left( 1 - \frac{M_Y^2}{4 m_\chi^2} \right) \qquad (1)$$

where $M_Y$ is the mass of the second annihilation product, 
such as a Z boson or a second photon. For the case where 
$Y = \gamma$, the line occurs at the mass of the dark matter 
particle, $E_\gamma = m_\chi$. This makes a search for peaks in the 
photon spectrum an important component of the dark 
matter program using Fermi-LAT dataflS]. 
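As a quick numerical illustration, the sketch below evaluates the standard kinematic relation for annihilation at rest, $E_\gamma = m_\chi (1 - M_Y^2 / 4 m_\chi^2)$, for a massless and a massive second product. The function name `line_energy` and the Z boson mass of about 91.19 GeV are assumptions introduced here for illustration; they do not come from the text.

```python
Z_MASS_GEV = 91.19  # assumed Z boson mass in GeV (not quoted in the text)


def line_energy(m_chi, m_y):
    """Photon energy (GeV) for chi chi -> gamma Y with the dark matter at rest."""
    return m_chi * (1.0 - m_y ** 2 / (4.0 * m_chi ** 2))


# Y = gamma (massless): the line sits at the dark matter mass.
e_gg = line_energy(130.0, 0.0)
# Y = Z: the accompanying line sits at a lower photon energy.
e_gz = line_energy(130.0, Z_MASS_GEV)
print(f"gamma gamma: {e_gg:.1f} GeV, gamma Z: {e_gz:.1f} GeV")
```

For a 130 GeV dark matter particle this places a $\gamma Z$ line near 114 GeV, below the $\gamma\gamma$ line.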

Recent studies have identified a feature in the gamma-ray
spectrum near $E_\gamma = 130$ GeV [l, 0| with a source 
close to the galactic center. The line feature is not 
accompanied by a lower-energy continuum emission, as 
would be expected in many models of dark matter interaction Q.
However, the large apparent significance 
of the feature has generated keen interest in exploring 
other, more mundane explanations, such as unconsidered 
features in the non-dark-matter background in the difficult
region of the galactic center, or instrumental effects 
in the Fermi-LAT detector. 

In this paper, we present a first study of the instrumental
characteristics of photons in the line feature, using the 
sPlots [8] algorithm to disentangle the two populations 
(background and peak). This allows us to reconstruct 
distributions in variables which may reveal instrumental 
issues that would not otherwise be apparent. 



SPLOTS 

In a sample of events with multiple sources, if one vari- 
able can be used to discriminate between the sources, the 
sPlots algorithm can reconstruct the statistical distri- 
bution of each of the sources in other variables, which we 
refer to as the 'unfolding variables'. sPlots uses only 
information from the discriminating variable and knowl- 
edge of the probability density functions (pdf) for each 
source in the discriminating variable. In addition, the al- 
gorithm assumes that the pdfs can be factorized between 
the discriminating and unfolding variables. 

For the purposes of clarity, we simplify the general 
sPlots formalism of Ref. [8] into the two-source case 
we will apply to the Fermi-LAT data. 

Given pdfs for two sources, $f_1(y)$ and $f_2(y)$, in the discriminating
variable $y$, one can construct a histogram in 
another unfolding variable $x$ using weights for each source 
class, $sP_1$ and $sP_2$, defined as: 

$$sP_1(y) = \frac{V_{11} f_1(y) + V_{12} f_2(y)}{N_1 f_1(y) + N_2 f_2(y)}, \qquad sP_2(y) = \frac{V_{21} f_1(y) + V_{22} f_2(y)}{N_1 f_1(y) + N_2 f_2(y)}$$

where $N_1$ and $N_2$ are the number of events in each class, 
as extracted by a likelihood fit of $f_1$ and $f_2$ to the observed
distribution in $y$, and the inverse of the matrix $V$ 
is a symmetric 2x2 matrix defined as 

$$V^{-1}_{ab} = \sum_{l=1}^{N_1 + N_2} \frac{f_a(y_l)\, f_b(y_l)}{\left( N_1 f_1(y_l) + N_2 f_2(y_l) \right)^2}$$

A histogram $h$ in the unfolding variable $x$ can then be 
constructed for source 1 as 

$$h_i = \sum_{j=1}^{N_i} sP_1(y_{ji})$$

where $i$ is the bin index in the $x$ variable, $N_i$ is the number
of events in that bin, and $y_{ji}$ is the value of the $y$ 
variable for the $j$th event in the $i$th bin. A histogram for 
source 2 would be constructed by replacing $sP_1 \to sP_2$. 

This technique is superior to simply making a selection 
in the $y$ variable to enhance the relative contribution of 
one source, which may still be significantly polluted by 
the other source. Note that if the two sources were completely
separable in the $y$ variable, then the sPlot weights 
would reduce trivially to 0 or 1/N. This appears to be 
the first application of this algorithm to an astrophysical 
problem [9]. 
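The two-source weights can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis code used for this paper: the peak position at $y = 5$, the triangular shapes in $x$, and the helper names (`fit_yields`, `sweights`) are hypothetical, and a simple fixed-shape EM iteration stands in for the likelihood fit of the yields.

```python
import math
import random


def peak_pdf(y, mu=5.0, sigma=1.0):
    """Gaussian pdf in y for the peak source (mu = 5 is an assumed value)."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def flat_pdf(y, lo=0.0, hi=10.0):
    """Uniform pdf in y for the non-peak source."""
    return 1.0 / (hi - lo)


def fit_yields(p1, p2, n_iter=200):
    """Maximum-likelihood yields for fixed pdf shapes, found by EM iteration."""
    n = len(p1)
    n1 = 0.5 * n
    for _ in range(n_iter):
        n2 = n - n1
        n1 = sum(n1 * a / (n1 * a + n2 * b) for a, b in zip(p1, p2))
    return n1, n - n1


def sweights(p1, p2, n1, n2):
    """Per-event weights (sP1, sP2) from the pdf values p1, p2 at each event."""
    a11 = a12 = a22 = 0.0  # elements of the inverse matrix V^-1
    for a, b in zip(p1, p2):
        d2 = (n1 * a + n2 * b) ** 2
        a11 += a * a / d2
        a12 += a * b / d2
        a22 += b * b / d2
    det = a11 * a22 - a12 * a12
    v11, v12, v22 = a22 / det, -a12 / det, a11 / det  # invert the 2x2 matrix
    sp1 = [(v11 * a + v12 * b) / (n1 * a + n2 * b) for a, b in zip(p1, p2)]
    sp2 = [(v12 * a + v22 * b) / (n1 * a + n2 * b) for a, b in zip(p1, p2)]
    return sp1, sp2


random.seed(1)
# Toy sample of (x, y) pairs: 1000 peak events and 1000 non-peak events.
peak = [(random.triangular(0, 10, 10), random.gauss(5.0, 1.0)) for _ in range(1000)]
flat = [(random.triangular(0, 10, 0), random.uniform(0.0, 10.0)) for _ in range(1000)]
events = peak + flat

p1 = [peak_pdf(y) for _, y in events]
p2 = [flat_pdf(y) for _, y in events]
n1, n2 = fit_yields(p1, p2)
sp1, sp2 = sweights(p1, p2, n1, n2)

# Weighted mean of x for the peak source, using only information from y.
mean_x_peak = sum(w * x for w, (x, _) in zip(sp1, events)) / sum(sp1)
print(f"N1 = {n1:.1f}, N2 = {n2:.1f}, unfolded <x> for peak = {mean_x_peak:.2f}")
```

Two properties of the weights serve as sanity checks once the yield fit has converged: the weights of each source sum to its fitted yield, and the two weights of any single event sum to one.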



Example of sPlots in Toy Data 

As an illustrative example, we generate toy data from 
two sources according to the pdfs: 

V 27T OU 

and 

/non-peak (^i y) — 

where $f_{\rm peak}(x, y)$ is normally distributed in $y$ with $\sigma = 1$, 
while $f_{\rm non\text{-}peak}(x, y)$ is uniform in $y$, providing good discrimination
power. Our goal is to construct histograms 
which reveal the distribution in $x$ for each of the two 
sources. 

Figure 1(a) shows the generated event distribution in 
the discriminating variable $y$ and the unfolding variable 
$x$ using 1000 events from each source. Figure 1(b) shows 
the projection in $y$ and the result of the fit to extract $N_1$ 
and $N_2$, which uses only this one-dimensional projection 
and the $y$-dependence of the pdfs. 

The unfolded distributions for each source are shown 
in Figure 1(c,d), along with the true pdfs in $x$, which 
were not used in reconstructing the unfolded distributions.
The sPlots algorithm is successfully able to disentangle
the two sources and reveal the $x$-dependence of 
each. 

Note that since the distributions are built by statistical
unfolding (rather than event-by-event), which is unaware of 
physical constraints, it is possible to have a negative prediction
in a bin. The statistical uncertainty in a given 
bin is $\Delta N = \sqrt{\sum sP^2}$. 



FIG. 1: Example application of the sPlots '&] algorithm in 
toy data. Top left, distribution of toy data in the discrimi- 
nating and unfolding variable. Top right, distribution of peak 
and non-peak events in the discriminating variable, with the 
fitted combined pdf. Bottom left (right) shows the non-peak 
(peak) distribution in the unfolding variable (blue points), 
with the true pdf (black line). The unfolded distributions are 
calculated by sPlots using only information from the dis- 
criminating variable. 
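The unfolded histogram and its per-bin uncertainty, the square root of the summed squared weights, can be sketched as follows. The function name `sweighted_histogram` and the small set of weights are hypothetical; the weights are chosen by hand only to show that a bin content may come out negative.

```python
import math


def sweighted_histogram(xs, weights, n_bins, lo, hi):
    """Sum sPlot weights in bins of x; return (contents, uncertainties)."""
    contents = [0.0] * n_bins
    sum_w2 = [0.0] * n_bins
    width = (hi - lo) / n_bins
    for x, w in zip(xs, weights):
        if lo <= x < hi:
            i = min(int((x - lo) / width), n_bins - 1)
            contents[i] += w
            sum_w2[i] += w * w
    # Per-bin statistical uncertainty: sqrt of the sum of squared weights.
    errors = [math.sqrt(s) for s in sum_w2]
    return contents, errors


# Hand-picked illustrative weights; sPlot weights may be negative.
xs = [0.5, 1.5, 1.7, 2.2, 2.9]
ws = [1.1, -0.2, 0.9, -0.3, -0.1]
contents, errors = sweighted_histogram(xs, ws, n_bins=3, lo=0.0, hi=3.0)
for c, e in zip(contents, errors):
    print(f"{c:+.2f} +/- {e:.2f}")
```

In this example the last bin has a negative content, which is a legitimate outcome of the statistical unfolding rather than a bug.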

The unfolded distributions can be reconstructed just as 
well for non-linear pdfs, see the example in Fig. 2(a,b), 
where trigonometric functions have replaced the x depen- 
dence of the pdfs. 

Correlations between $x$ and $y$ in the pdfs can lead to 
biases, but do not catastrophically undermine the unfolding.
For example, if we use 

$$f_{\rm non\text{-}peak}(x, y) \propto y + (1 - y/5)\,x$$

so that the slope in $x$ varies from positive to negative 
over $y \in [0, 10]$: 

$$f_{\rm non\text{-}peak}(x, 0) \propto x, \qquad f_{\rm non\text{-}peak}(x, 10) \propto 10 - x,$$

FIG. 2: Tests of the robustness of sPlots. In each case, 
blue points show the sPlots reconstructed distribution and 
the black line is the true pdf. Top, reconstruction of the 
unfolding variable with non-linear pdfs. Center and bottom, 
impact of correlations in the pdfs on the reconstruction of 
the unfolding variable. Center uses a slope which varies with 
the discriminating variable, averaging to zero. Bottom uses a 
variable-width peak. See text for details. 



then sPlots recovers the average $x$ dependence for the 
non-peak source, see Fig. 2(c,d), and the correct dependence
for the events from the peak. 
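The average $x$ dependence in this test can be worked out directly. Taking the $x$-dependent form $y + (1 - y/5)\,x$ of the correlated non-peak pdf and ignoring its overall normalization (an assumption made here for brevity), a plain average over $y \in [0, 10]$ gives

$$\frac{1}{10}\int_0^{10} \left[ y + \left(1 - \frac{y}{5}\right) x \right] dy = \frac{1}{10}\left[ \frac{y^2}{2} + \left( y - \frac{y^2}{10} \right) x \right]_0^{10} = 5 + 0 \cdot x ,$$

so the slope in $x$ averages to zero, and the average dependence for the non-peak source is flat in $x$.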

If instead the width of the peak in $y$ varies with $x$, 
there may be some biases introduced, but the effects are 
minor, even for a doubling of the peak width over the 
range $y \in [0, 10]$, see Fig. 2(e,f). 

Note that these types of correlations would be even 
more troublesome for the traditional strategy of making 
a selection to enhance or suppress one source. 



THE FERMI-LAT DATA SAMPLE 

We use the publicly available extended photon data 
from the Fermi-LAT collaboration 
through June 28th, 2012, making standard quality requirements [10]
and examining a square region around 
the galactic center, with galactic longitude $-5 < l < 5$ 
degrees and galactic latitude $-5 < b < 5$ degrees, with energy 
$E_\gamma > 50$ GeV. 
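The region and energy requirements can be sketched as a simple selection function. The record layout (dictionaries with keys `l`, `b` and `energy`) and the function name `in_region` are hypothetical; the public data use their own file format and column names, and the standard quality requirements are not modeled here.

```python
def in_region(lon_deg, lat_deg, energy_gev, half_width=5.0, e_min=50.0):
    """True if a photon passes the galactic-center region and energy cuts."""
    # Map galactic longitude from [0, 360) onto [-180, 180) so that the
    # galactic center sits at l = 0.
    lon = (lon_deg + 180.0) % 360.0 - 180.0
    return abs(lon) < half_width and abs(lat_deg) < half_width and energy_gev > e_min


# Illustrative photon records (values invented for the example).
photons = [
    {"l": 358.5, "b": 0.3, "energy": 131.2},  # l = -1.5 degrees, kept
    {"l": 4.0, "b": -6.0, "energy": 80.0},    # |b| too large, rejected
    {"l": 1.0, "b": 1.0, "energy": 20.0},     # energy too low, rejected
]
selected = [p for p in photons if in_region(p["l"], p["b"], p["energy"])]
print(len(selected), "photon(s) selected")
```

Wrapping the longitude first matters because catalogs often store $l$ in the range 0 to 360 degrees, where photons just west of the galactic center would otherwise fail the cut.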

Other than the reconstructed energy, the photons have 
other measured characteristics [13] which may give insight
into instrumental effects: 

• incident angle $\theta$, measured with respect to the top-face
normal of the LAT, 

• azimuth angle $\phi$, measured with respect to the top-face
normal of the LAT, folded as described in Eq. 
(15) of Ref. [11], 

• zenith angle, measured with respect to the zenith 
line, which passes through the Earth and the LAT's 
center of mass, 

• earth azimuth angle, the azimuthal angle relative 
to the same line as the zenith, defined such that 
zero indicates the photon came from the northern 
direction, 

• mission elapsed time, measured relative to January 
1, 2001, 

• conversion type (front or back), which indicates whether 
the event induced pair production in the front 
(thin) layers or the back (thick) layers of the 
tracker, 

• the probability that the best energy chosen from 
the three energy estimators is correct, 

• the probability that the direction estimate is good, 

• ratio of true/raw energy, 

• first layer of the tracker with a hit, 

• the magnetic field in which the LAT is immersed, 
as parameterized by the McIlwain B and L parameters [14], 

• the distance from the center of the South Atlantic 
Anomaly, calculated as $\sqrt{\Delta{\rm long}^2 + \Delta{\rm lat}^2}$ in terms 
of Earth latitude and longitude, and 

• the geomagnetic latitude of the spacecraft. 

In this paper, we study the distribution in these variables
for signal-like and background-like photons. In 
some cases, a large difference in the distribution of signal-like
and background-like photons would be a clear indication
of an instrumental issue. This is especially true for 
variables related to the spacecraft position, environment 
or angle (mission time, magnetic field, earth azimuth 
angle, distance from the South Atlantic Anomaly, geomagnetic
latitude). Other variables are connected to the quality or 
class of the reconstruction (incident angles, conversion 
type, reconstruction details) and would give more subtle 
clues as to whether the feature is due to a sub-class of 
photons, or photons with particularly high or poor resolution.
The response of the LAT is dependent on some of 
these variables. For example, the energy resolution is a 
function of the incident angle $\theta$ and the conversion type, 
see Fig. 3. In addition, the point-spread function depends 
on the point in the LAT where the photon converts. 

These indications would be only the first clues, and 
would need detailed follow-up by the instrument experts; 
a complete study is not possible with the information available
in the public data. The Fermi-LAT collaboration 
has already performed detailed studies of the instrument 
performance and calibration, including studies of potential
systematic biases 11, 15, 1 



Single-Line Analysis 

To analyze the features of the Fermi-LAT data using 
sPlots, we must define background and signal pdfs in 
the discriminating variable, $E_\gamma$. The background pdf is 
a simple power-law: 

$$f_{\rm bg}(E_\gamma \mid \beta, \alpha) = \beta\, E_\gamma^{-\alpha}$$

For the observed feature we assume a single line (the 
two-line hypothesis is discussed below) where the pdf 
$f_{\rm line}(E_\gamma \mid E_{\rm line})$ is defined according to the Fermi-LAT energy
dispersion tools definition [12] with a true photon 
energy of $E_{\rm line}$, see Fig. 3. Applying these pdfs to the 
observed photon energy spectrum yields the fit seen in 
Fig. 4. 
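For use as a pdf, a power law has to be normalized over the fitted energy window, which fixes the prefactor once the index is chosen. The sketch below shows one way to do this; the function name `power_law_pdf`, the index of 2.5 and the upper edge of 300 GeV are hypothetical values chosen for illustration, not the fitted values of this analysis.

```python
import math


def power_law_pdf(e, alpha, e_min, e_max):
    """Power-law density in energy, normalized to unit area on [e_min, e_max]."""
    if not e_min <= e <= e_max:
        return 0.0
    if alpha == 1.0:
        norm = math.log(e_max / e_min)
    else:
        norm = (e_max ** (1.0 - alpha) - e_min ** (1.0 - alpha)) / (1.0 - alpha)
    # The prefactor of the power law is 1 / norm.
    return e ** (-alpha) / norm


# Numerical check of the normalization with a midpoint sum.
alpha, e_min, e_max = 2.5, 50.0, 300.0  # assumed illustrative values
n_steps = 10000
step = (e_max - e_min) / n_steps
integral = sum(
    power_law_pdf(e_min + (k + 0.5) * step, alpha, e_min, e_max) * step
    for k in range(n_steps)
)
print(f"integral over the window = {integral:.6f}")
```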

Unfolded distributions of incidence angles are shown in 
Fig. 5. The distributions in galactic coordinates can be 
seen in Fig. 6. Zenith and azimuthal angle distributions 
are in Fig. 7 and the recorded time and conversion type 
are in Fig. 8. Energy and direction reconstruction quality 
are in Fig. 9 and the reconstructed/raw energy ratio as 
well as the first layer of the tracker with a hit are shown 
in Fig. 10. The magnetic field parameters are shown in 
Fig. 11 and the distance from the South Atlantic Anomaly 
and the geomagnetic latitude are shown in Fig. 12. 

In each case, we compare the distributions quantitatively
by calculating the $\chi^2$/dof between the peak and 
background distributions, shown in Table I. As the signal
and background weights are anti-correlated, this is 
calculated as 

FIG. 3: Probability density function in Fermi-LAT reconstructed
photon energy for photons with true energy of $E_{\rm line}$ = 
130 GeV, for varying choices of the incident angle $\theta$ and the 
conversion type [l|. 

FIG. 4: Energy of Fermi-LAT photons with signal plus background
fit, using a single-line hypothesis at $E_\gamma = 130$ GeV. 



(A, 
Nl 
(AAT 

where $\Delta_{\rm peak}$ is the sum of the weights $sP_{\rm peak}$ in that 
bin, and $\Delta\Delta_{\rm peak}$ is calculated from toy simulations which 
estimate the expected variance of the measurement in 
each bin. Similar expressions apply for the background 
uncertainties. 



Double-Line Analysis 



It has been suggested [17] that there may be two photon
lines, the $\gamma\gamma$ feature being accompanied by a feature 
due to $\gamma Z$ production, which would be at lower $E_\gamma$ (see 
Eq. (1)). 

We modify the signal pdf to include two lines, one at 

FIG. 5: Disentangled signal and background distributions. 
Left, $\cos(\theta)$ where $\theta$ is the photon incidence angle relative 
to a line normal to the Fermi-LAT face. Right, $\phi$, the photon 
incidence angle relative to the sun-facing side [13]. 



FIG. 8: Disentangled signal and background distributions. 
Left, the mission elapsed time since Jan 1 2001 [13]. Right, 
fraction of events in which the pair production is induced in 
the front (thin) or back (thick) layers of the tracker. 

FIG. 6: Disentangled signal and background distributions in 
galactic coordinates, $b$ (latitude) and $l$ (longitude) [13]. No 
smoothing has been applied. 



FIG. 9: Disentangled signal and background distributions. 
Left, probability of correct energy reconstruction. Right, 
probability of correct angle reconstruction. 



110 GeV and one at 130 GeV (the results are not sensitive 
to the precise position of the second line). We allow 
the two line features to float independently, but in the 
sPlots analysis we treat them together as a single pdf 
once their relative normalization has been fixed by the 
fit. The result of the fit can be seen in Figure 13. Note, 
however, that systematic or instrumental issues which 
cause features in the energy spectrum at 110 GeV and 
130 GeV may not be manifested in the same regions of the 
instrumental variables, and so may not add coherently. 
Unfolded distributions of incidence angles are shown 

FIG. 10: Disentangled signal and background distributions. 
Left, ratio of reconstructed to raw photon energy. Right, the 
first layer of the tracker with a hit. 

FIG. 7: Disentangled signal and background distributions. 
Left, angle between the reconstructed photon direction and 
the zenith line, which passes through the Earth and Fermi's 
center of mass. Right, the earth azimuth angle. 

FIG. 11: Disentangled signal and background distributions. 
Left, magnetic field strength in terms of the McIlwain B parameter.
Right, the McIlwain L parameter. 

FIG. 12: Disentangled signal and background distributions. 
Left, the distance in Earth longitude and latitude from the 
center of the South Atlantic Anomaly. Right, geomagnetic 
latitude of the spacecraft. 

FIG. 13: Energy of Fermi-LAT photons with signal plus background
fit in the double-line analysis at $E_\gamma = 110$ GeV and 
$E_\gamma = 130$ GeV. 



in Fig. 14. The distributions in galactic coordinates can 
be seen in Fig. 15. Zenith and azimuthal angle distributions
are in Fig. 16 and the recorded time and conversion 
type are in Fig. 17. Energy and direction reconstruction 
quality are in Fig. 18 and the reconstructed/raw energy 
ratio as well as the first layer of the tracker with a hit 
are shown in Fig. 19. The magnetic field parameters are 
shown in Fig. 20 and the distance from the South Atlantic
Anomaly and the geomagnetic latitude are shown in 
Fig. 21. In each case, we compare the distributions quantitatively
by calculating the $\chi^2$/dof between the peak and 
background distributions, shown in Table I. 



FIG. 14: Disentangled signal and background distributions in 
the double-line analysis. Left, $\cos(\theta)$ where $\theta$ is the photon 
incidence angle relative to a line normal to the Fermi-LAT face. 
Right, $\phi$, the photon incidence angle relative to the sun-facing 
side [13]. 

FIG. 15: Disentangled signal and background distributions in 
the double-line analysis in galactic coordinates, $b$ (latitude) 
and $l$ (longitude) [13]. No smoothing has been applied. 



SENSITIVITY 

The number of events in the observed peak is not 
large, which makes the task of identifying a potential 
instrumental feature difficult. Before we can draw conclusions
about the distributions above, we must understand
whether we would expect to see a feature given the 
current statistics. 

To probe this question, we perform simulated experiments
using a hypothetical variable in which the background
is uniformly distributed between 0 and 1 and the 
signal peak is a delta function at 0.45; this represents an 
optimistic scenario in which the entire signal is tightly 
clustered. Figure 22 shows representative individual ex- 


FIG. 16: Disentangled signal and background distributions 
in the double-line analysis. Left, angle between the reconstructed
photon direction and the zenith line, which passes 
through the Earth and Fermi's center of mass. Right, the 
earth azimuth angle. 

FIG. 17: Disentangled signal and background distributions in 
the double-line analysis. Left, the mission elapsed time since 
Jan 1 2001 [13]. Right, fraction of events in which the pair 
production is induced in the front (thin) or back (thick) layers 
of the tracker. 



FIG. 21: Disentangled signal and background distributions. 
Left, the distance in Earth longitude and latitude from the 
center of the South Atlantic Anomaly. Right, geomagnetic 
latitude of the spacecraft. 

FIG. 18: Disentangled signal and background distributions. 
Left, probability of correct energy reconstruction. Right, 
probability of correct angle reconstruction. 

FIG. 19: Disentangled signal and background distributions. 
Left, ratio of reconstructed to raw photon energy. Right, the 
first layer of the tracker with a hit. 

FIG. 20: Disentangled signal and background distributions. 
Left, magnetic field strength in terms of the McIlwain B parameter.
Right, the McIlwain L parameter. 



TABLE I: Summary of consistency between background and 
peak distributions for each of the considered instrumental 
variables, expressed as the $\chi^2$ per degree of freedom. 

Variable                    Single-line     Double-line
                            $\chi^2$/dof    $\chi^2$/dof
$\cos(\theta)$              8.9/5           10.7/5
Detector Azimuth            4.4/7           7.9/7
Zenith Angle                4.3/7           10.2/7
Earth Azimuth               1.1/4           2.5/4
Mission Time                1.6/7           2.7/7
Conversion Type             0.0/1           0.0/1
Prob correct energy         6.8/7           12.1/7
Prob correct dir            2.7/7           6.0/7
Reco/Raw energy             11.9/6          11.4/6
First tracker hit           2.4/7           3.6/7
McIlwain B                  11.9/7          11.9/7
McIlwain L                  2.5/7           3.8/7
Distance from SA Anomaly    1.4/6           1.9/6
Geomagnetic Latitude        2.6/7           6.5/7



ample experiments with either zero, 12 or 100 signal 
events. If the signal statistics were very large ($N_{\rm sig} = 100$ 
events), such a strong feature would be observable both 
as a discrepant single bin and as a $\chi^2$/d.o.f. with low probability,
$P(\chi^2/{\rm d.o.f.} = 34.6/7) = 10^{-5}$. In the current 
statistics ($N_{\rm sig} \sim 12$ events), the feature would be noticeable
in a single bin, but the $\chi^2$/d.o.f., which analyzes 
the global consistency of the two distributions, would be 
reasonable, $P(\chi^2/{\rm d.o.f.} = 7.7/7) = 0.36$. 
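The probabilities attached to each $\chi^2$/d.o.f. are upper-tail probabilities of the chi-square distribution, and can be checked with the Python standard library alone. The sketch below is an independent cross-check rather than the code used for this paper; the function name `chi2_pvalue` is hypothetical, and the recurrence it uses for the regularized incomplete gamma function holds only for integer degrees of freedom.

```python
import math


def chi2_pvalue(chi2, dof):
    """Upper-tail probability P(X >= chi2) for a chi-square with integer dof."""
    x = 0.5 * chi2
    if dof % 2 == 0:
        a, q = 1.0, math.exp(-x)             # Q(1, x) = exp(-x)
    else:
        a, q = 0.5, math.erfc(math.sqrt(x))  # Q(1/2, x) = erfc(sqrt(x))
    # Recurrence: Q(a + 1, x) = Q(a, x) + x^a exp(-x) / Gamma(a + 1)
    while a < 0.5 * dof:
        q += x ** a * math.exp(-x) / math.gamma(a + 1.0)
        a += 1.0
    return q


for chi2, dof in [(34.6, 7), (7.7, 7), (8.9, 5)]:
    print(f"chi2/dof = {chi2}/{dof}: p = {chi2_pvalue(chi2, dof):.2g}")
```

The three cases are the $\chi^2$/d.o.f. values quoted in the text, for which the stated probabilities are about $10^{-5}$, 0.36 and 0.11.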

In the instrumental features we study here, this sce- 
nario may be overly optimistic - a real instrumental fea- 
ture may appear as a more subtle difference between the 
two distributions. It may also be pessimistic, as the sig- 
nal feature could appear where the background is sup- 
pressed, whereas in the hypothetical variable the back- 
ground is uniform. However, the simulated experiments 
suggest that even if there were a true strong instrumen- 
tal disagreement between the signal-like and background- 
like photons, we may identify one or two discrepant bins, 
but are unlikely to find a $\chi^2$/dof with convincingly small 
probability. This emphasizes our earlier point, that an 

FIG. 22: Disentangled signal and background distributions 
in simulated experiments using a hypothetical variable in which the 
background is uniform and the signal is a delta function at 
$x = 0.45$, for varying amounts of signal. Top, no signal events; 
center, 12 signal events, the approximate number seen in the 
Fermi-LAT data; bottom, 100 signal events. 



DISCUSSION 

Examining the unfolded distributions, there are several 
bins which show suggestive but inconclusive discrepancies.
The distribution of $\cos(\theta)$ (Fig. 5) shows a bin with 
a large fraction of the signal events near $\cos(\theta) = 0.7$. 
Although the resolution depends on $\cos(\theta)$, this is unlikely 
to cause a feature in the energy spectrum, as the 
cluster occurs at the median value rather than at either
extreme. The overall consistency is reasonable, 
$P(\chi^2/{\rm d.o.f.} = 8.9/5) = 0.11$, though see the sensitivity 
discussion above. Similarly, there is a single discrepant 
bin in the McIlwain B parameter at 1.65 Gauss (Fig. 11). 
These may be useful clues for further instrumental studies.



CONCLUSIONS 

We have performed an initial study of the instrumental 
characteristics of events from the feature at $E_\gamma = 130$ 
GeV observed in the Fermi-LAT data. 

In the instrumental variables available in the public 
data distribution, we find no conclusive difference in characteristics
between peak photons and background photons,
see Table I. There are several suggestive discrepancies,
near $\cos(\theta) = 0.7$ and at a McIlwain B parameter of 
1.65 Gauss, which deserve further study by instrumental 
experts. 

There are several additional instrumental variables 
which should be examined, such as the incident position 
on the face of the LAT, but which are not available in the public 
data. 

If a striking feature had appeared - such as a clustering
of the peak photons at a given time or near a specific 
angle of incidence - it would have pointed to an instrumental
issue. The statistics of the sample are too poor to 
draw strong conclusions, but the lack of a very clear feature
makes an instrumental explanation somewhat less 
likely. 



observed discrepancy in the distribution of signal-like and 
background-like photons should serve as a clue for further 
instrumental studies, rather than conclusive evidence for 
or against an instrumental explanation. 

As a positive control, we can examine the galactic longitude.
The feature at $E_\gamma = 130$ GeV has been previously
localized to $l = -1.5^\circ$ [6], which is consistent with 
what we observe in Fig. 6. While the individual bin near 
$l = -1.5^\circ$ shows a large discrepancy between signal-like 
and background-like photons, the global agreement of the 
distributions in longitude is reasonable, with a p-value of 
0.3. 